Researchers tend to co-author with colleagues who have similar research interests, while they predominantly cite only those with differing interests (Ding, 2011). In this report we test the effect on network structure when various co-authorship propensities are altered. We start with a null model resembling Newman’s astrophysics co-authorship network (2004), and vary the degree to which authors are willing to co-author with others who have dissimilar interests, to see how this affects macro-level network properties.
By altering the propensity of simulated researchers to match with co-authors of dissimilar interests, we can model the effect on the overall disconnectedness of a particular field, and the role that individuals with differing distributions of expertise play in that structure. In doing so, we explore and quantify the possible benefits of greater cross-expertise collaboration in the academic sphere.
Individuals have some agency regarding who they choose to associate with. While the utility of numerous weak ties has been explored (Granovetter, 1977), individuals will likely invest more heavily in some relationships over others. The selection of these close associates can be based on dyadic properties such as homophily (McPherson, Smith-Lovin, & Cook, 2001), necessity of resources (Lee, 2014), or based on the potential associates’ position in a broader social network (Buechel & Buskens, 2008; Jackson & Wolinsky, 1996). Regardless of reason, trusting relationships are understood to facilitate the transmission of resources between individuals (Coleman, 1988).
One example of this exists within the co-authorship networks of academia. Co-authorship networks differ structurally across disciplines (Newman, 2004), and thus the ability of social resources to flow across these networks will also differ. It has been shown that strong, direct ties between collaborators produce higher quality research, while working with colleagues with similar research interests only influences research output (Gonzalez-Brambila, Veloso, & Krackhardt, 2013). Further, betweenness centrality has been shown to be a particularly good predictor of preferential attachment in a co-authorship network, as new entrants to the network approach these brokers when making their first connections (Abbasi, Hossain, & Leydesdorff, 2012). We examine the effects on these metrics using the following model.
We specify the following parameters in our model:

* The number of authors in the model (\(N\)),
* The number of iterations the model will run (\(T\)),
* The starting threshold \(\sigma_0\) each agent uses to decide if another agent is similar enough in interests to co-author with (so that the individual thresholds \(\sigma_i\) for each agent \(i\) are equal to \(\sigma_0\) at the start of the model),
* A decay parameter \(\gamma\) indicating how quickly agents will seek out agents who are less similar in interests,
* The maximum number of co-authors for any one agent (\(c_{max}\)),
* The maximum number of times an agent can be rejected by someone before giving up on forming a link with them (\(r_{max}\)), and
* The rate parameter \(\lambda\) of the exponential distribution used to determine the talent level in each of our 5 sub-topics, producing a talent vector \(\beta_i\) for each author \(i\).
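The parameter list above can be collected into a small configuration object. The following is a minimal sketch, not the implementation used for this report: the names `ModelParams`, `draw_talent_vector`, and `init_agents` are hypothetical, and the values chosen for \(T\), \(\sigma_0\), \(c_{max}\), \(r_{max}\), and \(\lambda\) are illustrative assumptions (only \(N = 9\) and \(\gamma = 0.11\) appear in the text).

```python
import random
from dataclasses import dataclass

N_SUBTOPICS = 5  # the report draws talent in 5 sub-topics


@dataclass
class ModelParams:
    n_authors: int     # N
    n_iterations: int  # T
    sigma_0: float     # starting similarity threshold
    gamma: float       # decay parameter
    c_max: int         # maximum number of co-authors per agent
    r_max: int         # maximum rejections before giving up on a partner
    lam: float         # rate of the exponential talent distribution


def draw_talent_vector(lam, rng):
    """Draw one talent level per sub-topic from Exponential(lam)."""
    return [rng.expovariate(lam) for _ in range(N_SUBTOPICS)]


def init_agents(params, seed=0):
    """Create agents whose thresholds all start at sigma_0."""
    rng = random.Random(seed)
    return [
        {
            "beta": draw_talent_vector(params.lam, rng),  # talent vector
            "sigma": params.sigma_0,                      # current threshold
            "coauthors": set(),
            "rejections": {},                             # partner -> count
        }
        for _ in range(params.n_authors)
    ]


# Illustrative values only; everything except N and gamma is assumed.
params = ModelParams(n_authors=9, n_iterations=100, sigma_0=0.9,
                     gamma=0.11, c_max=3, r_max=2, lam=1.0)
agents = init_agents(params)
```

Storing each agent as a plain dictionary keeps the sketch short; a fuller implementation might prefer a dedicated agent class.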
The model proceeds as follows. At a given time \(t\), we randomly allow each node \(p\) to initiate a connection with another node \(q\). If a node already has a partner, it is not eligible to be paired with. At time \(t + 1\), each of these connections is tested for affinity (trust). If the similarity of \(q\)’s interests to \(p\)’s (measured as \(1-JS(\beta_p,\beta_q)\), where \(JS(\cdot,\cdot)\) is the Jensen-Shannon distance) is above \(p\)’s similarity threshold \(\sigma_p\), and if \(q\) has not already formed \(c_{max}\) other connections, then this connection is solidified as a trusting relationship and an edge is formed between \(p\) and \(q\). Otherwise, the connection is dropped, and \(p\) “lowers their standards” by decaying their threshold \(\sigma_p\) by a factor of \(\gamma\), i.e., \(\sigma_p \leftarrow (1-\gamma)\,\sigma_p\).
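The affinity test described above can be sketched as follows. This is an illustration under stated assumptions rather than the report's code: talent vectors are assumed to be normalized to sum to 1 before the Jensen-Shannon distance is taken, the distance is assumed to use base-2 logarithms (so that it lies in \([0, 1]\)), and the decay is read as \(\sigma_p \leftarrow (1-\gamma)\sigma_p\). The function names are hypothetical, and the random pairing step and the \(r_{max}\) bookkeeping are omitted.

```python
import math


def normalize(beta):
    """Scale a non-negative talent vector so that it sums to 1."""
    total = sum(beta)
    return [b / total for b in beta]


def js_distance(beta_p, beta_q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence."""
    p, q = normalize(beta_p), normalize(beta_q)
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i = 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def similarity(beta_p, beta_q):
    return 1.0 - js_distance(beta_p, beta_q)


def try_connection(p, q, agents, gamma, c_max):
    """Form an edge p-q if the affinity test passes; else decay p's threshold."""
    a, b = agents[p], agents[q]
    similar_enough = similarity(a["beta"], b["beta"]) > a["sigma"]
    q_has_room = len(b["coauthors"]) < c_max
    if similar_enough and q_has_room:
        a["coauthors"].add(q)
        b["coauthors"].add(p)
        return True
    a["sigma"] *= (1.0 - gamma)  # p "lowers their standards"
    return False


# Three hypothetical agents: 0 and 2 share a specialty, 1 does not.
agents = [
    {"beta": [1.0, 0.0, 0.0, 0.0, 0.0], "sigma": 0.5, "coauthors": set()},
    {"beta": [0.0, 1.0, 0.0, 0.0, 0.0], "sigma": 0.5, "coauthors": set()},
    {"beta": [2.0, 0.0, 0.0, 0.0, 0.0], "sigma": 0.5, "coauthors": set()},
]
rejected = try_connection(0, 1, agents, gamma=0.11, c_max=3)
accepted = try_connection(0, 2, agents, gamma=0.11, c_max=3)
```

In this toy example the first attempt fails because the two talent vectors do not overlap, so agent 0's threshold decays; the second attempt succeeds because agents 0 and 2 have identical normalized interests.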
Here we see what happens if agents become more willing to co-author with people with different specialties, separated by the ‘talent’ of the agent. To construct the talent metric for a given author \(p\), we compute the entropy of their talent vector \(H(\beta_p)\) and then multiply this entropy by the ‘total talent’ \(\| \beta_p \|_1 = \sum_{i}(\beta_p)_i\), to produce a ‘talent-weighted-diversity’ measure \(TWD(p) = \| \beta_p \|_1H(\beta_p)\). We then plot the betweenness centrality of the authors with the lowest, highest, and median TWD scores, for varying levels of ‘expectation lowering rates’ \(\gamma \in \{0.00, 0.01, \dots, 0.19, 0.20\}\). Following the literature, we would expect the most ‘effective’ co-authorship networks to be those in which the high-TWD authors serve as ‘connectors’ between subfield clusters. We can thus read a preliminary result from the plot: the ‘optimal’ rate of ‘expectation lowering’, i.e., the rate which best facilitates high-TWD authors serving as connectors, is approximately \(\gamma = 0.11\), an \(11\%\) decay after each failed connection.
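The two quantities used in this analysis can be sketched in a few lines. The code below is a minimal illustration, not the report's implementation: it assumes the entropy is the Shannon entropy (natural logarithm) of the talent vector after normalizing it to sum to 1, and it computes unnormalized betweenness centrality on an unweighted, undirected graph with Brandes' algorithm. The function names and the example data are hypothetical.

```python
import math
from collections import deque


def entropy(beta):
    """Shannon entropy (natural log) of the normalized talent vector."""
    total = sum(beta)
    probs = [b / total for b in beta if b > 0]
    return sum(p * math.log(1.0 / p) for p in probs)


def twd(beta):
    """Talent-weighted diversity: L1 norm of beta times its entropy."""
    return sum(beta) * entropy(beta)


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an undirected graph.

    adj maps each node to an iterable of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                         # nodes in order of BFS discovery
        preds = {v: [] for v in adj}       # shortest-path predecessors
        n_paths = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        n_paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):          # accumulate dependencies
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is counted twice in an undirected graph.
    return {v: b / 2.0 for v, b in bc.items()}


# Hypothetical talent vectors, ranked from lowest to highest TWD.
talents = {
    "specialist": [4.0, 0.0, 0.0, 0.0, 0.0],
    "mixed": [2.0, 1.0, 1.0, 0.0, 0.0],
    "generalist": [1.0, 1.0, 1.0, 1.0, 1.0],
}
ranked = sorted(talents, key=lambda name: twd(talents[name]))
lowest, median, highest = ranked[0], ranked[len(ranked) // 2], ranked[-1]

# Hypothetical path graph a - b - c: only b lies between other nodes.
centrality = betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

A pure specialist has zero entropy and therefore a TWD of zero regardless of total talent, which is why the measure rewards authors who are both talented and broad.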
After calibration of the model with the astrophysics coauthorship network, however, we see that astrophysicists essentially exhibit overly-high ‘standards’, failing to lower their similarity thresholds at a rate which would best allow the emergence of inter-subfield connectors. Expanding from calibration of the toy model with \(N = 9\) to calibrating larger coauthorship networks (\(N \in \{100, 200, 400, 800\}\)), we find that this result continues to hold: the astrophysics coauthorship network ‘matches’ with networks generated by \(\gamma \in [0.05,0.09]\), while the optimal \(\gamma\) for uncalibrated networks remains almost identical at \(0.11\).
We see in the decay parameter graph that the parameter value closest to the real-life network is not the most beneficial for any agent (though it is moderately helpful for those of average talent). Our model indicates that if researchers were just slightly more willing to reach outside their specific interests and sub-fields, brokerage between topics would increase for academics across the spectrum of talent. This increase in brokerage would not only help ideas travel between cliques in the larger network, but might also increase the quality of the work done.
In creating this model, we showcase how small changes in the co-authorship behavior of researchers can greatly impact the structure of a field. Given that the type and quality of co-authorship ties have been shown to impact research quality and output, knowing what co-authorship behaviors to encourage can help enhance the scientific pursuit.
Abbasi, A., Hossain, L., & Leydesdorff, L. (2012). Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks. Journal of Informetrics, 6, 403–412. https://doi.org/10.1016/j.joi.2012.01.002
Buechel, B., & Buskens, V. (2008). The dynamics of closeness and betweenness.
Ding, Y. (2011). Scientific collaboration and endorsement: Network analysis of coauthorship and citation networks. Journal of Informetrics, 5, 187–203. https://doi.org/10.1016/j.joi.2010.10.008
Gonzalez-Brambila, C. N., Veloso, F. M., & Krackhardt, D. (2013). The impact of network embeddedness on research output. Research Policy, 42, 1555–1567. https://doi.org/10.1016/j.respol.2013.07.008
Granovetter, M. S. (1977). The strength of weak ties. In Social networks (pp. 347–367). Elsevier.
Handcock, M. S., Hunter, D. R., Butts, C. T., Goodreau, S. M., & Morris, M. (2003). Statnet: Software tools for the Statistical Modeling of Network Data.
Jackson, M. O., & Wolinsky, A. (1996). A Strategic Model of Social and Economic Networks. Journal of Economic Theory, 71(1), 44–74. https://doi.org/10.1006/jeth.1996.0108
Lee, N. H. (2014). The Search for an Abortionist: The Classic Study of How American Women Coped with Unwanted Pregnancy before Roe v. Wade. Open Road Media.
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology, 27, 415–444. https://doi.org/10.1146/annurev.soc.27.1.415
R Core Team. (2017). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.